feat: Add Chain Data tables #63

juan518munoz · 2023-12-13T20:50:17Z

from #5

Add block_headers and chain_mmr_nodes tables with their needed insertion and getter methods to interact with the Store.

src/client/mod.rs

igamigo · 2023-12-14T21:54:12Z

src/store/store.sql

+-- Create chain mmr nodes
+CREATE TABLE chain_mmr_nodes(
+    id INTEGER,          -- in-order index of the internal MMR node (autoincrements)
+    node BLOB NOT NULL,  -- internal node value (hash)
+    PRIMARY KEY (id)
+)


@bobbinth you mentioned that we would probably need to refactor ChainMmr in Miden base to have it work on a partial MMR (I'm guessing this TODO is related).
Right now the RPC returns partial MMR deltas AFAIK, and it should be enough to maintain a full MMR on the client side by adding new leaves, correct? Can you expand a little bit more on the motivation and how it could work?

I believe depending on how frequently the client makes requests, we may not get all the leaves and the client won't be able to build the full MMR.

There is an issue to fix this 0xPolygonMiden/miden-base#112 - I'll work on this and it'll be ready by Monday.

bobbinth

Thank you! I left a few comments inline explaining in more detail how the overall structure should work. It may require a bit more explanation though, so I'm happy to talk through it.

bobbinth · 2023-12-20T10:41:15Z

src/store/chain_data.rs

+impl Store {
+    // CHAIN DATA
+    // --------------------------------------------------------------------------------------------
+    pub fn insert_block_header(&mut self, block_header: BlockHeader) -> Result<(), StoreError> {


As mentioned in the previous comments, this will need to take an additional parameter for chain_mmr_peaks (this would be Vec<Digest>).

Now thinking about this, we might also need to pass the forest (u64) to this as well because our goal is to be able to reconstruct PartialMmr that the client keeps track of. This requires:

forest: u64 - this would be in the block_headers table.

peaks: Vec<RpoDigest> - this would also be in the block_headers table.

nodes: BTreeMap<InOrderIndex, RpoDigest> - this would come from the chain_mmr_nodes table.

track_latest: bool - not sure where this should come from yet.

Is this still needed? forest can be derived from peaks, and the PartialMmr constructor takes only the peaks.
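For context on the forest/peaks relationship discussed here: in an MMR the forest value is the leaf count, and each set bit corresponds to one peak. The sketch below uses hypothetical helper names (they are not part of miden-crypto) to show that the number of peaks follows directly from forest. The reverse only works if the peaks are stored together with the leaf count, since a bare list of digests fixes only how many bits are set (5 and 6 leaves both give two peaks).

```rust
/// Number of peaks in an MMR with `forest` leaves: one peak per set bit.
/// (Hypothetical helper for illustration only.)
fn num_peaks(forest: u64) -> u32 {
    forest.count_ones()
}

/// Leaf counts of the individual peak trees, largest first.
/// (Hypothetical helper for illustration only.)
fn peak_sizes(forest: u64) -> Vec<u64> {
    (0..u64::BITS)
        .rev()
        .map(|bit| 1u64 << bit)
        .filter(|&size| (forest & size) != 0)
        .collect()
}
```

So whether forest needs its own column likely depends on whether the stored peaks value already carries the leaf count.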

bobbinth · 2023-12-20T10:55:14Z

src/store/chain_data.rs

+    pub fn insert_chain_mmr_node(&mut self, chain_mmr: ChainMmr) -> Result<(), StoreError> {
+        let node = serialize_chain_mmr(chain_mmr)?;
+
+        const QUERY: &str = "INSERT INTO chain_mmr_nodes (node) VALUES (?)";
+
+        self.db
+            .execute(QUERY, params![node])
+            .map_err(StoreError::QueryError)
+            .map(|_| ())
+    }
+
+    pub fn get_chain_mmr_hash_by_id(&self, id: u64) -> Result<ChainMmr, StoreError> {
+        const QUERY: &str = "SELECT id, node FROM chain_mmr_nodes WHERE id = ?";
+        self.db
+            .prepare(QUERY)
+            .map_err(StoreError::QueryError)?
+            .query_map(params![id as i64], parse_chain_mmr_nodes_columns)
+            .map_err(StoreError::QueryError)?
+            .map(|result| {
+                result
+                    .map_err(StoreError::ColumnParsingError)
+                    .and_then(parse_chain_mmr_nodes)
+            })
+            .next()
+            .ok_or(StoreError::ChainMmrNodeNotFound(id))?
+    }


Not sure these methods are correct. chain_mmr_nodes table is meant to store nodes from PartialMmr struct. Basically, every row in this table would be a single entry in the nodes map of PartialMmr.

We probably don't want to insert the whole partial MMR every time, but rather only insert the nodes resulting from each update.

So, the methods here should probably be a bit lower-level. Something like:

pub fn insert_chain_mmr_nodes(&mut self, nodes: Vec<(InOrderIndex, Digest)>) -> Result<(), StoreError> { }

/// Returns all nodes in the table.
pub fn get_chain_mmr_nodes(&mut self) -> Result<BTreeMap<InOrderIndex, Digest>, StoreError> { }

/// Gets a list of nodes required to reconstruct authentication paths for the specified blocks.
///
/// This will be used for `get_transaction_data()` method of `DataStore`.
pub fn get_chain_mmr_paths(
    &mut self,
    block_numbers: &[u32]
) -> Result<Vec<(InOrderIndex, Digest)>, StoreError> { }

So, to check that I understand correctly:

insert_chain_mmr_nodes should receive only the nodes derived from each update, iterating over them, inserting each pair (InOrderIndex, Digest) in the table.

get_chain_mmr_nodes retrieves all rows from the table, and puts them inside a BTreeMap.

get_chain_mmr_paths only retrieves the rows in the table whose InOrderIndex matches any of the elements of block_numbers.

insert_chain_mmr_nodes should receive only the nodes derived from each update, iterating over them, inserting each pair (InOrderIndex, Digest) in the table.

This is correct. In the naive implementation this should be pretty simple as we can just take everything that was added to the nodes map after the last inserted node (new nodes would always have a bigger index).

get_chain_mmr_nodes retrieves all rows from the table, and puts them inside a BTreeMap.

Correct.

get_chain_mmr_paths only retrieves the rows in the table whose InOrderIndex matches any of the elements of block_numbers.

It is a bit more involved as for each block number we'll need to return all nodes in the path from the block to the root of the corresponding peak. Let's leave this for another PR.
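The first two methods agreed on in this exchange can be sketched roughly as follows. This is only an illustration under stated assumptions: `NodeTable` is an in-memory stand-in for the chain_mmr_nodes table (the real Store goes through SQLite), `InOrderIndex` and `Digest` are simplified to plain aliases, and `nodes_after` is a hypothetical helper for the naive "everything after the last inserted index" selection, which relies on the statement above that new nodes always have a bigger index. get_chain_mmr_paths is left out, as it is deferred to another PR.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for the real types (assumptions for this sketch only).
type InOrderIndex = u64;
type Digest = [u8; 32];

/// In-memory stand-in for the `chain_mmr_nodes` table.
#[derive(Default)]
struct NodeTable {
    rows: BTreeMap<InOrderIndex, Digest>,
}

impl NodeTable {
    /// Inserts one row per (index, digest) pair produced by an update.
    fn insert_chain_mmr_nodes(&mut self, nodes: Vec<(InOrderIndex, Digest)>) {
        self.rows.extend(nodes);
    }

    /// Returns all rows, keyed by in-order index.
    fn get_chain_mmr_nodes(&self) -> BTreeMap<InOrderIndex, Digest> {
        self.rows.clone()
    }
}

/// Naive selection of the nodes to persist: every entry of the in-memory map
/// whose index is larger than the last index already stored.
fn nodes_after(
    nodes: &BTreeMap<InOrderIndex, Digest>,
    last_inserted: Option<InOrderIndex>,
) -> Vec<(InOrderIndex, Digest)> {
    match last_inserted {
        Some(last) => nodes
            .range(last + 1..)
            .map(|(idx, digest)| (*idx, *digest))
            .collect(),
        None => nodes.iter().map(|(idx, digest)| (*idx, *digest)).collect(),
    }
}
```

Keeping the selection separate from the insertion means the insert method stays a plain batch write, which matches the "lower-level" shape suggested in the review.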

Looking at the implementation of InOrderIndex, it seems like there's no way to serialize this type or to access its inner usize. Am I missing something?

Yes, currently missing but I'm adding it in 0xPolygonMiden/crypto#238.

bobbinth · 2023-12-20T11:14:08Z

src/client/mod.rs

        self.store
-            .apply_state_sync(new_block_num, new_nullifiers)
+            .apply_state_sync(new_block_num, new_nullifiers, new_block_header)
            .map_err(ClientError::StoreError)?;


A general comment about sync_state function: my original intent was for it to work slightly differently. Specifically:

The sync_state request to the node gives us the next block containing requested data. It also gives us chain_tip which is the latest block number in the chain. So, unless response.block_header.block_num == response.chain_tip we haven't synced to the tip of the chain yet.

The idea was that we'd make these requests in a loop until response.block_header.block_num == response.chain_tip, at which point we know that we've fully synchronized with the chain.

Each request also brings us info about new notes, nullifiers etc. created. It also returns Chain MMR delta that we can use to update the state of Chain MMR. This includes both chain MMR peaks and chain MMR nodes.

A naive way to update chain MMR is to load full PartialMmr at the beginning of this method and then call apply on it for every response. There is a better way to do it - though, the details require more thought.

We can add the naive implementation to leave it in a working state, and change it to a better one if we can come up with one.
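The loop described in this comment could look roughly like the sketch below. All names are hypothetical: `SyncResponse` is a trimmed-down view of the real response, `request` stands in for the RPC call, and `apply` stands in for updating the store and calling apply on the in-memory PartialMmr. Error handling is omitted for brevity; real code would presumably return a Result.

```rust
/// Hypothetical, trimmed-down view of a `sync_state` response.
struct SyncResponse {
    block_num: u32,
    chain_tip: u32,
}

/// Repeats `sync_state` requests until the returned block is the chain tip.
/// `request` receives the last synced block number; `apply` is called once per
/// response. Returns the block number the client ends up synced to.
fn sync_to_tip<R, A>(mut current_block: u32, mut request: R, mut apply: A) -> u32
where
    R: FnMut(u32) -> SyncResponse,
    A: FnMut(&SyncResponse),
{
    loop {
        let response = request(current_block);
        apply(&response);
        current_block = response.block_num;
        if response.block_num == response.chain_tip {
            return current_block;
        }
    }
}
```

In the naive variant, the PartialMmr would be loaded once before calling this function and updated inside `apply` for every response.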

bobbinth · 2023-12-20T20:57:34Z

A few more comments about the overall syncing procedure (some applicable to this PR, but many probably for future PRs):

First, as already mentioned in one of the comments above, the idea is that we make requests to the sync_state endpoint until response.chain_tip == response.block_header.block_num. In the requests, we send:

block_num from state_sync table.
A list of all unique account IDs from the accounts table.
A list of tags from the state_sync table.
A list of nullifier prefixes for all un-consumed input notes.

For every response we get, we do the following:

1. Update block_num in the state_sync table to response.block_header.block_num.
2. Check if the returned account hashes match the latest account hashes in the database. If they don't match, something got corrupted and we won't be able to execute transactions against accounts where there is a state mismatch.
3. For any consumed nullifiers, update the corresponding input notes. This also implies that transactions in which these notes were created have also been committed, and thus we need to update their state and the states of involved accounts accordingly.
4. Update the input notes table based on the returned notes. Here, we'll assume that we already have most of the note's details in the table (these notes could be imported previously via a side channel or created locally). But these notes would be missing anchor info (e.g., location in the chain and inclusion path). So, basically, for every returned note:
   a. We look up a note record by note hash in the input_notes table. If no note is found, we just move to the next returned note.
   b. If a note is found, we update its anchor info. This will make this note consumable because now we can build the inclusion proof for the note (which is required to execute a transaction).
5. If the response brought back any relevant notes (i.e., the ones that were not ignored in the previous step), we also need to update our chain data tables. Specifically, we need to insert a new block header (from response.block_header) and also update the chain_mmr_nodes table. The simplest way to do this is to maintain an in-memory representation of the PartialMmr struct which contains info from these tables.
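The note-matching part of the procedure above (look up each returned note, skip unknown ones, fill in anchor info, and only touch the chain data tables when something matched) might be sketched like this. The types are hypothetical simplifications: `Anchor` and `InputNoteRecord` are placeholders for whatever the input_notes table actually stores, and a HashMap stands in for the table itself.

```rust
use std::collections::HashMap;

// Simplified stand-in for a note hash (assumption for this sketch only).
type NoteHash = [u8; 32];

/// Hypothetical anchor info: where the note sits in the chain.
#[derive(Clone, Debug, PartialEq)]
struct Anchor {
    block_num: u32,
    note_index: u32,
}

/// Hypothetical input-note record; `anchor` is `None` until the note is seen on chain.
#[derive(Default)]
struct InputNoteRecord {
    anchor: Option<Anchor>,
}

/// Applies the notes returned by one response to the locally known notes.
/// Unknown notes are skipped. Returns `true` if at least one known note was
/// updated, meaning the block header and MMR nodes for this block should be stored.
fn apply_returned_notes(
    input_notes: &mut HashMap<NoteHash, InputNoteRecord>,
    returned: &[(NoteHash, Anchor)],
) -> bool {
    let mut any_relevant = false;
    for (hash, anchor) in returned {
        if let Some(record) = input_notes.get_mut(hash) {
            record.anchor = Some(anchor.clone());
            any_relevant = true;
        }
    }
    any_relevant
}
```

The boolean result is what would gate the block header insertion and the chain_mmr_nodes update for that response.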

igamigo · 2023-12-22T03:40:17Z

FYI, I'm handling some of Bobbin's comments on the syncing procedure on #68 (except the account hash checks and maintaining the MMR which are more related to this PR)

igamigo · 2023-12-26T20:16:30Z

Pushed some changes that use the newer changes of the crypto crate (applying the delta now returns the new nodes and we can use that to insert to the table). Further changes to the sync state procedure are left for future PRs (some of them in #68).

bobbinth

Looks good! Thank you!

src/store/accounts.rs

add block headers & chain mmr nodes table

26784de

juan518munoz marked this pull request as ready for review December 14, 2023 12:54

juan518munoz added 4 commits December 14, 2023 14:53

update chain state on sync

3174eff

modularize chain data

fd028cc

remove prints

4385338

clippy

f309a94

igamigo requested a review from bobbinth December 14, 2023 20:40

igamigo reviewed Dec 14, 2023

juan518munoz added 4 commits December 15, 2023 09:39

fixes

e75b651

change block response

bd7fd25

Merge branch 'main'

8d0f1d9

clippy

6c09d46

bobbinth requested changes Dec 20, 2023

juan518munoz added 2 commits December 20, 2023 09:55

Merge branch 'main'

2277a31

partial review

fd63d83

juan518munoz and others added 6 commits December 21, 2023 17:02

review continued

9d21f6f

add bobbin comments for guidance

15cced4

Add datastore and improve state sync

5f8fece

Remove binary proof

3c51fa2

Split changes for another PR

f9e4bf4

fix test

7bc9e25

juan518munoz and others added 7 commits December 22, 2023 09:31

clippy

0538ac0

Merge branch 'igamigo-impl-data-store'

9897981

sync loop

c478f5f

check acc hashes & apply delta

4fb3716

Merge branch 'main'

af741fe

check accounts

3118750

update chain mmr insertion after crypto crate update (#8)

4fa5f95

igamigo requested a review from bobbinth December 26, 2023 20:13

bobbinth approved these changes Dec 27, 2023

igamigo added 2 commits December 29, 2023 14:39

make function private

8c1bc03

pub(crate)

18e57c6

igamigo merged commit 653b3a9 into 0xPolygonMiden:main Dec 29, 2023
4 checks passed

bobbinth mentioned this pull request Jan 10, 2024

Separated docs by components, added API docs 0xPolygonMiden/miden-node#136

Merged
